DOC: update the pandas.DataFrame.to_sql docstring #20126

jazzmuesli · 2018-03-10T13:26:04Z

Checklist for the pandas documentation sprint (ignore this if you are doing
an unrelated PR):

- PR title is "DOC: update the docstring"
- The validation script passes: scripts/validate_docstrings.py <your-function-or-method>
- The PEP8 style check passes: git diff upstream/master -u -- "*.py" | flake8 --diff
- The html version looks good: python doc/make.py --single <your-function-or-method>
- It has been proofread on language by another sprint participant

Please include the output of the validation script below between the "```" ticks:


################################################################################
##################### Docstring (pandas.DataFrame.to_sql)  #####################
################################################################################

Write records stored in a DataFrame to a SQL database.

This function inserts all rows of the dataframe into the given
 table and recreates it if if_exists='replace'. Databases supported by
 SQLAlchemy or DBAPI2 are also supported.

Parameters
----------
name : string
    Name of SQL table.
con : SQLAlchemy engine or DBAPI2 connection (legacy mode)
    Using SQLAlchemy makes it possible to use any DB supported by that
    library. If a DBAPI2 object, only sqlite3 is supported.
schema : string, default None
    Specify the schema (if database flavor supports this). If None, use
    default schema.
if_exists : {'fail', 'replace', 'append'}, default 'fail'
    Accepted values:
    - fail: If table exists, do nothing.
    - replace: If table exists, drop it, recreate it, and insert data.
    - append: If table exists, insert data. Create if does not exist.
index : boolean, default True
    Write DataFrame index as a column.
index_label : string or sequence, default None
    Column label for index column(s). If None is given (default) and
    `index` is True, then the index names are used.
    A sequence should be given if the DataFrame uses MultiIndex.
chunksize : int, default None
    If not None, then rows will be written in batches of this size at a
    time.  If None, all rows will be written at once.
dtype : dict of column name to SQL type, default None
    Optional specifying the datatype for columns. The SQL type should
    be a SQLAlchemy type, or a string for sqlite3 fallback connection.

Returns
--------
    None

See Also
--------
pandas.read_sql_query : read a DataFrame from a table

Examples
--------
>>> from sqlalchemy import create_engine
>>> engine = create_engine('sqlite:///example.db', echo=False)
>>> df = pd.DataFrame({'name' : ['User 1', 'User 2', 'User 3']})
>>> # create a table from scratch with 3 rows
>>> df.to_sql('users', con=engine, if_exists='replace')
>>> df1 = pd.DataFrame({'name' : ['User 4', 'User 5']})
>>> # 2 new rows inserted
>>> df1.to_sql('users', con=engine, if_exists='append')
>>> # table will be recreated and 5 rows inserted
>>> df = pd.concat([df, df1], ignore_index=True)
>>> df.to_sql('users', con=engine, if_exists='replace')
>>> pd.read_sql_query("select * from users",con=engine)
   index    name
0      0  User 1
1      1  User 2
2      2  User 3
3      3  User 4
4      4  User 5

################################################################################
################################## Validation ##################################
################################################################################

Docstring for "pandas.DataFrame.to_sql" correct. :)

If the validation script still gives errors, but you think there is a good reason
to deviate in this case (and there are certainly such cases), please state this
explicitly.
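
For readers who want to try the `if_exists` modes from the docstring above, here is a minimal sketch. It assumes an in-memory SQLite database opened with the standard-library `sqlite3` module (the DBAPI2 "legacy mode" connection) rather than a SQLAlchemy engine, purely to avoid the extra dependency; the table and column names are the ones used in the Examples section. Note that in the pandas releases I am aware of, the default `'fail'` mode raises `ValueError` when the table already exists rather than silently doing nothing.

```python
import sqlite3

import pandas as pd

# Assumption: an in-memory SQLite database via the stdlib DBAPI2 driver,
# used here instead of a SQLAlchemy engine to keep the sketch dependency-free.
conn = sqlite3.connect(":memory:")

df = pd.DataFrame({"name": ["User 1", "User 2", "User 3"]})
df1 = pd.DataFrame({"name": ["User 4", "User 5"]})

# 'replace': drop the table if it exists, recreate it, and insert 3 rows.
df.to_sql("users", con=conn, if_exists="replace")

# 'append': insert 2 more rows into the existing table.
df1.to_sql("users", con=conn, if_exists="append")

# Default 'fail': the table already exists, so pandas raises ValueError.
try:
    df.to_sql("users", con=conn)
    failed = False
except ValueError:
    failed = True

# Read everything back; the DataFrame index was written as an "index" column.
result = pd.read_sql("SELECT * FROM users", con=conn)
conn.close()
```

Because `df1` was appended with its own index, the `index` column in `result` restarts at 0 for the appended rows; pass `index=False` to `to_sql` if the index should not be stored.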

ghost · 2018-03-10T13:28:36Z

pandas/core/generic.py

+
+        Examples
+        --------
+        >>> import pandas as pd


pandas import is not recommended in examples.

ghost · 2018-03-10T13:28:57Z

pandas/core/generic.py

@@ -1865,17 +1865,21 @@ def to_sql(self, name, con, schema=None, if_exists='fail', index=True,
        """
        Write records stored in a DataFrame to a SQL database.

+        This function inserts all rows of the dataframe into the given


would be good to know which databases are supported and how.

ghost · 2018-03-10T13:32:34Z

pandas/core/generic.py

+        >>> gen_users = lambda ids: {"id": ids, "name" : gen_names(ids)}
+        >>> df=pd.DataFrame(gen_users(list(range(3))))
+        >>> # create a table from scratch
+        >>> df.to_sql('users',con=engine,if_exists='replace')


some example lines are not compatible with PEP8

ghost · 2018-03-10T13:33:43Z

pandas/core/generic.py

+        >>> engine = create_engine('sqlite:///example.db', echo=False)
+        >>> gen_names = lambda ids: ["User" + str(x) for x in ids]
+        >>> gen_users = lambda ids: {"id": ids, "name" : gen_names(ids)}
+        >>> df=pd.DataFrame(gen_users(list(range(3))))


maybe it's a bit more straightforward to generate dataframe without other functions:
i.e. df=pd.DataFrame(['User 1', 'User 2', 'User 3'])

TomAugspurger · 2018-03-10T13:36:42Z

pandas/core/generic.py

        con : SQLAlchemy engine or DBAPI2 connection (legacy mode)
            Using SQLAlchemy makes it possible to use any DB supported by that
            library. If a DBAPI2 object, only sqlite3 is supported.
        schema : string, default None
            Specify the schema (if database flavor supports this). If None, use
            default schema.
        if_exists : {'fail', 'replace', 'append'}, default 'fail'
+            Accepted values:


FWIW, I find it clearer without the line here, though I know the script complains when it isn't present :) cc @datapythonista

TomAugspurger · 2018-03-10T13:36:58Z

pandas/core/generic.py

@@ -1892,6 +1897,29 @@ def to_sql(self, name, con, schema=None, if_exists='fail', index=True,
            Optional specifying the datatype for columns. The SQL type should
            be a SQLAlchemy type, or a string for sqlite3 fallback connection.

+        Returns


The Returns section can be omitted if there's no return value.

python scripts/validate_docstrings.py pandas.DataFrame.to_sql complains:
Errors found:
No returns section found

TomAugspurger · 2018-03-10T13:37:26Z

pandas/core/generic.py

+
+        See Also
+        --------
+        pandas.io.sql.to_sql : this function will be called.


I don't think that method is public. I think just link to pandas.read_sql

TomAugspurger · 2018-03-10T13:38:12Z

pandas/core/generic.py

+        --------
+        >>> from sqlalchemy import create_engine
+        >>> engine = create_engine('sqlite:///example.db', echo=False)
+        >>> gen_names = lambda ids: ["User" + str(x) for x in ids]


Any chance you could simplify this? It's a bit dense.

Maybe just make a simple df, display it. then show the insert. And maybe show a pd.read_sql to view the result.

pep8speaks · 2018-03-10T15:39:25Z

Hello @jazzmuesli! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on March 12, 2018 at 22:15 Hours UTC

jreback · 2018-03-10T15:39:12Z

pandas/core/generic.py

@@ -1865,17 +1865,22 @@ def to_sql(self, name, con, schema=None, if_exists='fail', index=True,
        """
        Write records stored in a DataFrame to a SQL database.

+        This function inserts all rows of the dataframe into the given
+         table and recreates it if if_exists='replace'. Databases supported by


can add a link to SQLAlchemy in References

jreback · 2018-03-10T15:39:48Z

pandas/core/generic.py

+        >>> engine = create_engine('sqlite://', echo=False)
+        >>> df = pd.DataFrame({'name' : ['User 1', 'User 2', 'User 3']})
+        >>> # create a table from scratch with 3 rows
+        >>> df.to_sql('users', con=engine, if_exists='replace')


can you add blank lines between cases
also for comments, don't use the leading '>>>'

jreback · 2018-03-10T15:45:50Z

pandas/core/generic.py

+
+        See Also
+        --------
+        pandas.read_sql_query : read a DataFrame from a table


just to pandas.read_sql

TomAugspurger · 2018-03-12T22:16:07Z

Made some updates if anyone has a chance to review @jazzmuesli.

TomAugspurger · 2018-03-13T20:27:35Z

Thanks @jazzmuesli !

preich added 4 commits March 10, 2018 11:16

added an example for to_sql

13fd3bd

removed sqlalchemy/sqlite output; fixed docstring

b653565

Merge remote-tracking branch 'upstream/master'

0b1eb27

fixed line width

c48870f

ghost reviewed Mar 10, 2018

ghost reviewed Mar 10, 2018

removed import pandas

9cce9e9

ghost reviewed Mar 10, 2018

spaces added in the example

b4dd9b1

TomAugspurger reviewed Mar 10, 2018

preich added 2 commits March 10, 2018 13:45

removed id column, added read_sql_query to the example

9af573d

no need to create files

be02b35

remove pointless Accepted values

62e9cb0

jreback requested changes Mar 10, 2018

jreback added the Docs and IO SQL (to_sql, read_sql, read_sql_query) labels Mar 10, 2018

rollback

e315f67

jreback requested changes Mar 10, 2018

preich and others added 6 commits March 10, 2018 16:05

added references

15a8938

comments added

c82208b

comments added

7c503b6

comments added

5668eec

use read_sql

9e1bec4

Updates

fa9cc36

* Added more examples
* Reworded extended
* Reformed if_exists
* Added Raises
* Removed Returns
* Added DBAPI2 ref

TomAugspurger added this to the 0.23.0 milestone Mar 13, 2018

TomAugspurger merged commit 50b2184 into pandas-dev:master Mar 13, 2018
